NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

EvidenceBot: A Privacy-Preserving, Customizable RAG-Based Tool for Enhancing Large Language Model Interactions

https://doi.org/10.1145/3696630.3728607

Khan, Nafiz Imtiaz; Filkov, Vladimir (June 2025, Proceedings of the 33rd ACM International Conference on the Foundations of Software Engineering)

Large Language Models (LLMs) have become pivotal in reshaping the world by enabling advanced natural language processing tasks such as document analysis, content generation, and conversational assistance. Their ability to process and generate human-like text has unlocked unprecedented opportunities across different domains such as healthcare, education, finance, and more. However, commercial LLM platforms face several limitations, including data privacy concerns, context size restrictions, lack of parameter configurability, and limited evaluation capabilities. These shortcomings hinder their effectiveness, particularly in scenarios involving sensitive information, large-scale document analysis, or the need for customized output. This underscores the need for a tool that combines the power of LLMs with enhanced privacy, flexibility, and usability. To address these challenges, we present EvidenceBot, a local, Retrieval-Augmented Generation (RAG)-based solution designed to overcome the limitations of commercial LLM platforms. Evidence-Bot enables secure and efficient processing of large document sets through its privacy-preserving RAG pipeline, which extracts and appends only the most relevant text chunks as context for queries. The tool allows users to experiment with hyperparameter configurations, optimizing model responses for specific tasks, and includes an evaluation module to assess LLM performance against ground truths using semantic and similarity-based metrics. By offering enhanced privacy, customization, and evaluation capabilities, EvidenceBot bridges critical gaps in the LLM ecosystem, providing a versatile resource for individuals and organizations seeking to leverage LLMs effectively.
more » « less
Free, publicly-accessible full text available June 23, 2026
Open-Source LLMs for Technical Q&A: Lessons from StackExchange

Babar, Zeerak; Khan, Nafiz Imtiaz; Hassnain, Muhammad; Filkov, Vladimir (May 2025, 2025 International Conference on Software Engineering of Emerging Technologies (SEET-25))

In the rapidly evolving domain of software engineering (SE), Large Language Models (LLMs) are increasingly leveraged to automate developer support. Open source LLMs have grown competitive with pro- prietary models such as GPT-4 and Claude-3, without the associated financial and accessibility constraints. This study investigates whether state of the art open source LLMs including Solar-10.7B, CodeLlama-7B, Mistral-7B, Qwen2-7B, StarCoder2-7B, and LLaMA3-8B can generate responses to technical queries that align with those crafted by human experts. Leveraging retrieval augmented generation (RAG) and targeted fine tuning, we evaluate these models across critical performance dimen- sions, such as semantic alignment and contextual fluency. Our results show that Solar-10.7B, particularly when paired with RAG and fine tun- ing, most closely replicates expert level responses, o!ering a scalable and cost e!ective alternative to commercial models. This vision paper high- lights the potential of open-source LLMs to enable robust and accessible AI-powered developer assistance in software engineering.
more » « less
Free, publicly-accessible full text available May 23, 2026
From Models to Practice: Enhancing OSS Project Sustainability with Evidence-Based Advice

https://doi.org/10.1145/3663529.3663777

Khan, Nafiz Imtiaz; Filkov, Vladimir (July 2024, ACM)

Full Text Available

Search for: All records